Patterns of link reciprocity in directed networks

We address the problem of link reciprocity, the non-random presence of two mutual links between
pairs of vertices. We propose a new measure of reciprocity that allows the ordering of networks
according to their actual degree of correlation between mutual links. We find that real networks are
always either correlated or anticorrelated, and that networks of the same type (economic, social,
cellular, financial, ecological, etc.) display similar values of the reciprocity. The observed patterns
are not reproduced by current models. This leads us to introduce a more general framework where
mutual links occur with a conditional connection probability. In some of the studied networks we
discuss the form of the conditional connection probability and the size dependence of the reciprocity.

The recent discovery of a complex network structure in many different
physical, biological and socioeconomic systems has triggered an
increasing effort in understanding the basic mechanisms determining the
observed topological organization of networks [1,2]. Nontrivial
properties such as a scale-free character, clustering, and correlations
between vertex degrees are now widely documented in real networks,
motivating an intense theoretical activity concerned with network
modelling [1-3].

In the present paper we focus on a peculiar type of correlation present
in directed networks: link reciprocity, or the tendency of vertex pairs
to form mutual connections between each other [4]. In other words, we
are interested in determining whether double links (with opposite
directions) occur between vertex pairs more or less often than expected
by chance. This problem is fundamental for several reasons. Firstly, if
the network supports some propagation process (such as the spreading of
viruses in e-mail networks [5,6] or the iterative exploration of Web
pages in the WWW [7]), then the presence of mutual links will clearly
speed up the process and increase the possibility of reaching target
vertices from an initial one. By contrast, if the network mediates the
exchange of some good, such as wealth in the World Trade Web [8-10] or
nutrients in food webs [11,12], then any two mutual links will tend to
balance the flow determined by the presence of each other. The
reciprocity also tells us how much information is lost when a directed
network is regarded as undirected (as often done, for instance when
measuring the clustering coefficient [1,5,6,8,9]). Finally, detecting
nontrivial patterns of reciprocity is interesting by itself, since it
can reveal possible mechanisms of social, biological or different
nature that systematically act as organizing principles shaping the
observed network topology.

In general, directed networks range between the two extremes of a
purely bidirectional one (such as the Internet, where information
always travels both ways along computer cables) and of a purely
unidirectional one (such as citation networks [1], where recent papers
can cite less recent ones while the opposite cannot occur). A
traditional way of quantifying where a real network lies between such
extremes is measuring its reciprocity $r$ as the ratio of the number of
links pointing in both directions $L^{\leftrightarrow}$ to the total
number of links $L$ [4,5,8]:

$$ r \equiv \frac{L^{\leftrightarrow}}{L} \tag{1} $$

Clearly, $r = 1$ for a purely bidirectional network while $r = 0$ for a
purely unidirectional one. In general, the value of $r$ represents the
average probability that a link is reciprocated. Social networks [4],
email networks [5], the WWW [5] and the World Trade Web [8] were
recently found to display an intermediate value of $r$.
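As a minimal sketch of eq.(1), assume the network is stored as a list of directed (source, target) pairs. The function name `reciprocity_r` and the toy edge list are illustrative choices, not taken from the original analysis.

```python
def reciprocity_r(edges):
    """Return r = L_mutual / L for a list of directed (source, target) links."""
    links = set(edges)                      # ignore duplicated links
    total = len(links)                      # L: total number of directed links
    # L_mutual: links whose reverse link is also present (self-loops never count)
    mutual = sum(1 for (i, j) in links if i != j and (j, i) in links)
    return mutual / total if total else 0.0

# Toy graph: 0<->1 and 2<->3 are mutual, 1->2 and 0->3 are single links.
toy_edges = [(0, 1), (1, 0), (1, 2), (2, 3), (3, 2), (0, 3)]
r = reciprocity_r(toy_edges)                # 4 of the 6 links are reciprocated
print(r)
```

Note that $L^{\leftrightarrow}$ counts links, not pairs, so each mutual pair contributes two units to the numerator.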

However, the above definition of reciprocity poses various conceptual
problems that we would like to highlight before proceeding with a
systematic analysis of real networks. Firstly, the measured value of
$r$ must be compared with the value $r^{rand}$ expected in a random
graph with the same number of vertices and links in order to assess if
mutual links occur more or less (or just as) often than expected by
chance [5]. This means that $r$ has only a relative meaning and does
not carry complete information by itself. Secondly, and consequently,
the definition (1) does not allow a clear ordering of different
networks with respect to their actual degree of reciprocity. To see
this, note that $r^{rand}$ is larger in a network with larger link
density (since mutual connections occur by chance more often in a
network with more links), and as a consequence it is impossible to
compare the values of $r$ for networks with different density, since
they have distinct reference values. Finally, note that even in two
networks with the same density, the definition (1) can give
inconsistent results if $L$ includes the number of self-loops (links
starting and ending at the same vertex). Since the latter can never
occur in mutual pairs, while their number can vary significantly across
different networks, a finer measure of reciprocity should exclude them
from the potential set of mutual connections (hence $L$ should be
defined as the number of links minus that of self-loops).
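The density dependence of the reference value can be checked numerically. The sketch below assumes a simple directed random graph in which every ordered vertex pair is linked independently with the same probability. The helper names, the graph size and the seed are arbitrary illustrative choices.

```python
import random

def random_digraph(n, density, seed=0):
    """Directed random graph: each ordered pair i != j is linked with probability `density`."""
    rng = random.Random(seed)
    return {(i, j) for i in range(n) for j in range(n)
            if i != j and rng.random() < density}

def r_and_density(links, n):
    """Return (r, link density), with self-loops excluded from L."""
    links = {(i, j) for (i, j) in links if i != j}
    total = len(links)
    mutual = sum(1 for (i, j) in links if (j, i) in links)
    return mutual / total, total / (n * (n - 1))

# In an uncorrelated graph the measured r should stay close to the link density.
for density in (0.05, 0.2):
    r_rand, a_bar = r_and_density(random_digraph(200, density), 200)
    print(f"density={a_bar:.3f}  r={r_rand:.3f}")
```

Under these assumptions the denser graph shows the larger $r$ although neither graph has any built-in tendency towards reciprocity, which is why raw values of $r$ cannot be compared across densities.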

In order to avoid the aforementioned problems, we propose a new
definition of reciprocity (denoted as $\rho$ to avoid confusion with
$r$) as the correlation coefficient between the entries of the
adjacency matrix of a directed graph ($a_{ij} = 1$ if a link from $i$
to $j$ is there, and $a_{ij} = 0$ if not):

$$ \rho \equiv \frac{\sum_{i \neq j} (a_{ij} - \bar{a})(a_{ji} - \bar{a})}{\sum_{i \neq j} (a_{ij} - \bar{a})^2} \tag{2} $$

where the average value $\bar{a} \equiv \sum_{i \neq j} a_{ij}/N(N-1) =
L/N(N-1)$ measures the ratio of observed to possible directed links
(link density), and self-loops are now and in the following excluded
from $L$, since $i \neq j$ in the sums appearing in eq.(2). Note that
with such a choice $r^{rand} = \bar{a}$, since in an uncorrelated
network the average probability of finding a reciprocal link between
two connected vertices is simply equal to the average probability of
finding a link between any two vertices, which is given by $\bar{a}$.

Although the above definition appears much more complicated than
eq.(1), it reduces to a very simple expression. Indeed, since
$\sum_{i \neq j} a_{ij} a_{ji} = L^{\leftrightarrow}$ and
$\sum_{i \neq j} a_{ij}^2 = \sum_{i \neq j} a_{ij} = L$, eq.(2) simply
gives

$$ \rho = \frac{L^{\leftrightarrow}/L - \bar{a}}{1 - \bar{a}} = \frac{r - \bar{a}}{1 - \bar{a}} \tag{3} $$



The correlation coefficient $\rho$ is free from the conceptual problems
mentioned above, since it is an absolute quantity which directly allows
one to distinguish between reciprocal ($\rho > 0$) and antireciprocal
($\rho < 0$) networks, with mutual links occurring more and less often
than random respectively. In this respect, $\rho$ is similar to the
assortativity coefficient [3], which allows one to distinguish between
assortative and disassortative networks. The neutral or areciprocal
case corresponds to $\rho = 0$. Note that if all links occur in
reciprocal pairs one has $\rho = 1$ as expected. However, if
$L^{\leftrightarrow} = 0$ one has $\rho = \rho_{min}$ where

$$ \rho_{min} \equiv \frac{-\bar{a}}{1 - \bar{a}} \tag{4} $$

which is always different from $\rho = -1$ unless $\bar{a} = 1/2$. This
occurs because in order to have perfect anticorrelation ($a_{ij} = 1$
whenever $a_{ji} = 0$) there must be the same number of zero and
nonzero elements of $a_{ij}$, or in other words half the maximum
possible number of links in the network. This is another remarkable
advantage of using $\rho$, since it incorporates the idea that complete
antireciprocity ($L^{\leftrightarrow} = 0$) is more statistically
significant in networks with larger density, while it has to be
regarded as a less pronounced effect in sparser networks. Also note
that the expression for $\rho_{min}$ only makes sense if
$\bar{a} \le 1/2$, since with higher link density it is impossible to
have $L^{\leftrightarrow} = 0$ and the minimum reciprocity is no longer
given by eq.(4) (values of $\bar{a}$ larger than $1/2$ are observed for
the most recent data of the World Trade Web shown below). Finally, note
that the definition (2) allows a direct generalization to weighted
networks or graphs with multiple edges by substituting $a_{ij}$ with
any matrix $w_{ij}$.
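The equivalence between the correlation form (2) and the compact form (3), together with the lower bound (4), can be verified on a small example. The following is a minimal sketch assuming the graph is given as a 0/1 adjacency matrix stored as nested lists. All function names and both toy matrices are illustrative.

```python
def rho_from_matrix(a):
    """Eq. (2): correlation coefficient between a_ij and a_ji over all i != j."""
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    a_bar = sum(a[i][j] for i, j in pairs) / len(pairs)
    num = sum((a[i][j] - a_bar) * (a[j][i] - a_bar) for i, j in pairs)
    den = sum((a[i][j] - a_bar) ** 2 for i, j in pairs)
    return num / den

def rho_from_counts(n, total, mutual):
    """Eq. (3): rho = (r - a_bar) / (1 - a_bar), with r = mutual / total."""
    a_bar = total / (n * (n - 1))
    return (mutual / total - a_bar) / (1 - a_bar)

def rho_min(n, total):
    """Eq. (4): value of rho when no link is reciprocated (meaningful for a_bar <= 1/2)."""
    a_bar = total / (n * (n - 1))
    return -a_bar / (1 - a_bar)

# Toy graph: 0<->1 and 2<->3 are mutual, 1->2 and 0->3 are single links.
mixed = [[0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 0]]
# Directed 4-cycle 0->1->2->3->0: no link is reciprocated.
cycle = [[0, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1],
         [1, 0, 0, 0]]

print(rho_from_matrix(mixed), rho_from_counts(4, 6, 4))  # both forms agree
print(rho_from_matrix(cycle), rho_min(4, 4))             # antireciprocal bound
```

For the first matrix $L = 6$, $L^{\leftrightarrow} = 4$ and $\bar{a} = 1/2$, so eq.(3) gives $\rho = (2/3 - 1/2)/(1/2) = 1/3$. The cycle has $\bar{a} = 1/3$ and reaches $\rho_{min} = -1/2$ rather than $-1$.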

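As in ref. [3], we can evaluate the standard deviation $\sigma_\rho$
for $\rho$ in terms of the values $\rho_{ij}$ obtained when any (single
or not) link between vertices $i$ and $j$ is removed:

$$ \sigma_\rho^2 = \sum_{i<j} (\rho - \rho_{ij})^2 = L^{\leftrightarrow} (\rho - \rho^{\leftrightarrow})^2 + (L - L^{\leftrightarrow})(\rho - \rho^{\rightarrow})^2 \tag{5} $$

where

$$ \rho^{\leftrightarrow} = \frac{(L^{\leftrightarrow} - 2)/(L - 2) - (L - 2)/N(N-1)}{1 - (L - 2)/N(N-1)} $$

is the value of $\rho$ when a pair of mutual links is removed and

$$ \rho^{\rightarrow} = \frac{L^{\leftrightarrow}/(L - 1) - (L - 1)/N(N-1)}{1 - (L - 1)/N(N-1)} $$

is the value of $\rho$ when the link between two singly connected
vertices is removed.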
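A minimal sketch of this error estimate follows. It implements eq.(5) with the two weights $L^{\leftrightarrow}$ and $L - L^{\leftrightarrow}$ exactly as they appear in the formula, and takes only the counts $N$, $L$ and $L^{\leftrightarrow}$ as input. The function names are illustrative, and the code assumes $L > 2$ so that no denominator vanishes.

```python
from math import sqrt

def rho(n, total, mutual):
    """Eq. (3) evaluated from N, L and the number of reciprocated links."""
    a_bar = total / (n * (n - 1))
    return (mutual / total - a_bar) / (1 - a_bar)

def sigma_rho(n, total, mutual):
    """Eq. (5) as written: squared shifts of rho under link removal, weighted and summed."""
    base = rho(n, total, mutual)
    var = 0.0
    if mutual > 0:
        rho_pair = rho(n, total - 2, mutual - 2)    # one mutual pair removed
        var += mutual * (base - rho_pair) ** 2
    if total > mutual:
        rho_single = rho(n, total - 1, mutual)      # one single link removed
        var += (total - mutual) * (base - rho_single) ** 2
    return sqrt(var)

# Toy values: N = 4 vertices, L = 6 links, 4 of them reciprocated.
print(rho(4, 6, 4), sigma_rho(4, 6, 4))
```

On such a tiny graph the estimated error is comparable to $\rho$ itself. In large networks removing one link barely changes $\rho$, so each squared shift is far smaller.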

We can now proceed with the analysis of the reciprocity in a coherent
fashion. Table I shows the values of $\rho$ computed on 133 real
networks. The most striking result is that, when ordered by decreasing
values of $\rho$ as shown in the table, all networks turn out to be
clearly arranged in groups of the same kind. The most correlated system
is the international import/export network or World Trade Web (WTW),
displaying $0.68 < \rho < 0.95$ for each of its 53 annual snapshots
[10] in the time interval 1948-2000 (more details on this system are
given below). The WTW is followed by a portion of the WWW [7] and by
two versions of the neural network of the nematode C. elegans [13,14]
(one where the vertices are different neuron classes and one where they
are single neurons). For the two neural networks, we find that the
reciprocity is preserved ($\rho_{neurons} = 0.17 \pm 0.02$ and
$\rho_{classes} = 0.18 \pm 0.04$) even after removing the links
corresponding to gap junctions (which, differently from the chemical
synapses, are intrinsically bidirectional [13,14]). We then have two
different e-mail networks (one built from the address books of users
[5] and one from the actual exchange of messages [6] in two different
Universities). The small difference in their values of $\rho$ suggests
the presence of a similar underlying social structure between pairs of
users, either appearing in each other's address book or mutually
exchanging actual messages. A similar consideration applies to the two
word association networks [15] (one based on the relations between the
terms of the Online Dictionary of Library and Information Science and
one on the empirical free associations between words collected in the
Edinburgh Associative Thesaurus), since completely free associations
between words seem to reproduce most of the mutuality present in a
network with logically or semantically linked terms, an interesting
effect probably related to some intrinsic psychological factor. The
weakly correlated range $0.006 < \rho < 0.052$ is covered by the 43
cellular networks of ref. [16], where reciprocity is related to the
potential reversibility of biochemical reactions. Finally, we find that
the antireciprocal region $\rho < 0$ hosts the shareholding networks
corresponding to two US financial markets [17] and 28 different food
webs: the 22 largest ones of ref. [11] and the six studied in ref.
[12]. We note that often $\rho = \rho_{min}$ for both classes of
networks, highlighting the tendency of companies to avoid mutual
financial ownerships and the scarce presence of mutualistic
interactions (symbiosis) in ecological webs.

Network                              $\rho$     $\sigma_\rho$   $\rho_{min}$
World Trade Web (53 snapshots) [10]:
  most correlated (year 2000)        0.952                      ($\bar{a} > 0.5$)
  least correlated (year 1948)       0.68       0.01            -0.80
World Wide Web [7]                   0.5165     0.0006          -0.0001
Neural Networks [13,14]
  Neuron classes                     0.44       0.03            -0.04
  Neurons
Shareholding Networks [17]
  NYSE                               -0.0012    0.0001          -0.0012
  NASDAQ                             -0.0034    0.0002          -0.0034
Food Webs [11,12]
  Silwood Park                       -0.0159    0.0008          -0.0159
  Grassland                          -0.018     0.002           -0.018
  Ythan Estuary                      -0.031     0.005           -0.034
  Little Rock Lake                   -0.044     0.007           -0.080
  Adirondack lakes (22 webs):
    least correlated (L. Rainbow)    -0.102     0.007           -0.102
  St. Marks Seagrass                 -0.105     0.008           -0.105
  St. Martin Island                  -0.13      0.01            -0.13
Perfectly antireciprocal             -1                         -1

TABLE I. Values of $\rho$ (in decreasing order), $\sigma_\rho$ and
$\rho_{min}$ for several networks. For three large groups of networks,
only the most and the least correlated ones are shown.

This clear ordering of network classes according to their reciprocity
suggests that in each class there is an inherent mechanism yielding
systematically similar values of the reciprocity, or in other words
that the reciprocity structure is a peculiar aspect of the topology of
various directed networks. In all cases we find that real networks are
either reciprocal or antireciprocal ($\rho_{real} \neq 0$), in striking
contrast with current models that generally yield areciprocal networks
($\rho_{model} = 0$). To see this, note that $\rho$ aggregates the
information about a deeper mechanism existing between each pair of
vertices. Let $p_{ij} \equiv p(i \to j)$ denote the probability that a
link is drawn from vertex $i$ to vertex $j$. In the general case, the
probability $p_{i \leftrightarrow j}$ of having a pair of mutual links
between $i$ and $j$ is given by



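$$ p_{i \leftrightarrow j} = p(i \to j \cap i \leftarrow j) = r_{ij} p_{ji} = r_{ji} p_{ij} \tag{6} $$

where $r_{ij}$ is the conditional probability of having a link from $i$
to $j$ given that the mutual link from $j$ to $i$ is there:

$$ r_{ij} \equiv p(i \to j \mid i \leftarrow j) \tag{7} $$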



Note that $\langle r_{ij} \rangle \equiv \sum_{i \neq j} r_{ij}/N(N-1)
= r$, motivating the choice of the symbol. The expected value of $\rho$
reads

Now, in most models the presence of the mutual link does not affect the
connection probability, or in other words $r_{ij} = p_{ij}$ and
$p_{i \leftrightarrow j} = p_{ij} p_{ji}$. This yields $\rho = 0$ in
eq.(8), meaning that model networks are areciprocal. The only way to
integrate reciprocity in the models is considering a nontrivial form
($r_{ij} \neq p_{ij}$) of the conditional probability (hence the
information required to generate the network is no longer specified by
$p_{ij}$ alone). This allows us to introduce, beyond
$p_{i \leftrightarrow j}$, the probability
$p_{i \to j} = p_{ij} - r_{ij} p_{ji}$ of having a single link from $i$
to $j$ (and no reciprocal link from $j$ to $i$), and the probability
$p_{i \nleftrightarrow j}$ (fixed by the equality
$p_{i \to j} + p_{i \leftarrow j} + p_{i \leftrightarrow j} +
p_{i \nleftrightarrow j} = 1$) of having no link between $i$ and $j$.
The network can then be generated by drawing, for each single vertex
pair, a link from $i$ to $j$, a link from $j$ to $i$, two mutual links
or no link with probabilities $p_{i \to j}$, $p_{i \leftarrow j}$,
$p_{i \leftrightarrow j}$ and $p_{i \nleftrightarrow j}$ respectively.
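The generation procedure described above can be sketched as follows, under the simplifying assumption (made here only for illustration) that the connection probability and the conditional probability are the same constants `p` and `r` for every vertex pair. The function names, parameter values and seed are hypothetical.

```python
import random

def generate_reciprocal_digraph(n, p, r, seed=0):
    """Draw every vertex pair as mutual, single (either way) or empty, with uniform p and r."""
    p_mutual = r * p                  # probability of two mutual links
    p_single = p - r * p              # probability of a single link i->j (same for j->i)
    p_none = 1.0 - p_mutual - 2.0 * p_single
    if p_none < 0:
        raise ValueError("p and r are incompatible: the four probabilities exceed 1")
    rng = random.Random(seed)
    links = set()
    for i in range(n):
        for j in range(i + 1, n):     # visit each unordered pair once
            u = rng.random()
            if u < p_mutual:
                links.update({(i, j), (j, i)})
            elif u < p_mutual + p_single:
                links.add((i, j))
            elif u < p_mutual + 2.0 * p_single:
                links.add((j, i))
            # otherwise the pair stays unconnected
    return links

def measured_rho(links, n):
    """Eq. (3) evaluated on a generated link set."""
    total = len(links)
    mutual = sum(1 for (i, j) in links if (j, i) in links)
    a_bar = total / (n * (n - 1))
    return (mutual / total - a_bar) / (1 - a_bar)

n, p, r = 200, 0.1, 0.5
links = generate_reciprocal_digraph(n, p, r)
print(measured_rho(links, n), (r - p) / (1 - p))   # measured vs. expected from eq. (3)
```

Setting `r` equal to `p` in this sketch recovers the uncorrelated case, for which the measured $\rho$ fluctuates around zero.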

The form of $r_{ij}$ can be in principle very complicated; however, in
some of the studied networks we find that it is constant. In
particular, we observe that in each snapshot of the World Trade Web the
in-degree $k_i^{in} = \sum_j p_{ji}$ and the out-degree
$k_i^{out} = \sum_j p_{ij}$ of a vertex are approximately equal,
meaning that $p_{ij} \approx p_{ji}$ and hence
$r_{ij} \approx r_{ji}$. Then we find (see fig.1) that for these
networks the reciprocal degree
$k_i^{\leftrightarrow} = \sum_j p_{ij} r_{ji}$ (number of mutual link
pairs of a vertex) is proportional to the total degree
$k_i^T = \sum_j (p_{ij} + p_{ji}) \approx 2 \sum_j p_{ij}$, or
$k_i^{\leftrightarrow} = c\, k_i^T$. This means
$r_{ij} \approx r \approx 2c$, which is confirmed by the excellent
agreement between the fitted values of $c$ and the values
$r/2 = L^{\leftrightarrow}/2L$ obtained independently (see the legend
in fig.1). A similar trend, even if with larger fluctuations, is
displayed by the neural networks and the message-based email network
(not shown). The other networks instead do not display any clear
behaviour, meaning that $r_{ij}$ has in general a more complicated
form.
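The proportionality between reciprocal and total degree can be illustrated on a toy graph. The sketch below assumes a least-squares fit through the origin, which is one possible fitting choice and not necessarily the procedure used for fig.1. The function names and the toy link set are illustrative.

```python
def degree_pairs(links):
    """Return {vertex: (k_total, k_mutual)} for a set of directed links."""
    links = {(i, j) for (i, j) in links if i != j}   # self-loops are excluded
    k_total, k_mutual = {}, {}
    for (i, j) in links:
        k_total[i] = k_total.get(i, 0) + 1           # out-degree contribution
        k_total[j] = k_total.get(j, 0) + 1           # in-degree contribution
        if (j, i) in links:
            k_mutual[i] = k_mutual.get(i, 0) + 1     # one mutual pair seen from i
    return {v: (k, k_mutual.get(v, 0)) for v, k in k_total.items()}

def fit_slope(links):
    """Least-squares slope c of k_mutual = c * k_total, constrained through the origin."""
    pairs = degree_pairs(links).values()
    return sum(kt * km for kt, km in pairs) / sum(kt * kt for kt, _ in pairs)

# Toy graph: 0<->1 and 2<->3 are mutual, 1->2 and 0->3 are single links.
toy_links = {(0, 1), (1, 0), (1, 2), (2, 3), (3, 2), (0, 3)}
c = fit_slope(toy_links)
r_half = sum(1 for (i, j) in toy_links if (j, i) in toy_links) / (2 * len(toy_links))
print(c, r_half)
```

Here every vertex has total degree 3 and one mutual pair, so the fitted slope coincides with $r/2 = L^{\leftrightarrow}/2L = 1/3$. This matches the general identity obtained by summing both degrees over all vertices.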

Another important problem is the size dependence $\rho(N)$. As evident
from eq.(3), this depends on both $r(N)$ and $\bar{a}(N)$, which
display different trends in different classes of networks and therefore
should be considered separately for each class. We found three
instructive cases, as reported in fig.2. For cellular networks
$\bar{a}(N) \propto N^{-1}$, implying $\rho \to r$ as $N$ increases;
therefore the asymptotic behaviour of $\rho$ depends only on that of
$r$, which is found to increase as $N$ increases. By contrast,
$r \approx 0$ for food webs, so that in this case $\rho(N)$ only
depends on $\bar{a}(N)$, whose form is however unclear, probably due to
the small size of the webs [11], and therefore no clear trend is
observed for $\rho(N)$ either. The behaviour of the WTW is more
complicated because both $r$ and $\bar{a}$ contribute relevantly to
$\rho$, and because its $N$-dependence reflects its temporal evolution
($N$ increases monotonically during the considered time interval).
Between 1948 and 1990, $N$ increases from 76 to 165, mainly because
various colonies become independent states, but $\bar{a}$ and $r$ (and
hence $\rho$) fluctuate about a roughly constant value. Then, after a
sudden increase ($N > 180$) in 1991 due to the formation of new states
from the USSR, $N$ grows very slowly while $\bar{a}$, $r$ and $\rho$
increase rapidly, an interesting signature of the faster globalization
process of the economy and the tighter interdependence of world
countries. Indeed, the steep increase $\rho \to 1$ signals that the
world economy is rapidly evolving towards an "ordered phase" where all
trade relationships are bidirectional.

FIG. 2. Plots of $\rho(N)$, $r(N)$ and $\bar{a}(N)$ on: a) the 43
cellular networks of ref. [16], b) the 28 food webs of refs. [11,12]
and c) the 53 annual snapshots (1948-2000) of the WTW [10].

More generally, this could suggest promoting $\rho$ to an order
parameter whose continuous variation from $\rho < 1$ to $\rho = 1$
corresponds to a discontinuous change in the symmetry properties of the
adjacency matrix (from a non-symmetric phase to a symmetric, maximally
ordered one), a typical behaviour displayed within the theory of
second-order phase transitions and critical phenomena. The most
disordered phase corresponds instead to $\rho = 0$, since
$r_{ij} = p_{ij}$ and the knowledge of the event $j \to i$ adds no
information on the event $i \to j$. The point $\rho = -1$ is again,
even if not completely, informative since $r_{ij} = 0$.

The results discussed here represent a first step towards
characterizing the reciprocity structure of real networks and
understanding its onset in terms of simple mechanisms. Our findings
show that reciprocity is a common property of many networks, which is
not captured by current models. Our framework provides a preliminary
theoretical approach to this poorly studied problem.